About This Manual


Preface 


Read This First 


This manual introduces the TMS320C62xx devices. The TMS320C6201 device
is the most powerful general-purpose programmable digital signal
processor (DSP) available. The information in this manual describes the devices
and provides a basic overview of how to use them. For more detailed
information, see the related documentation.


How to Use This Manual 


This document contains the following chapters: 


Chapter 1, Introduction, describes the main features of the TMS320C62xx
devices, the history of TI DSPs, and typical applications.


Chapter 2, CPU Architecture, describes the architecture of the TMS320C62xx 
devices, with a block diagram and brief introduction to the parts of the device. 


Chapter 3, Memory, describes the on-chip memory and the external memory 
interface. 


Chapter 4, Peripherals, describes the peripherals available for the ’C62xx
devices, such as ports, timers, direct-memory access, and power-down logic.


Chapter 5, Development Support, describes the tools, documentation, Web
site, and third-party support for the TMS320C6x.


Related Documentation From Texas Instruments 


The following books describe the TMS320C62xx devices and related support
tools. To obtain a copy of any of these TI documents, call the Texas
Instruments Literature Response Center at (800) 477-8924. When ordering, please
identify the book by its title and literature number.


TMS320C6x Assembly Language Tools User’s Guide (literature number 
SPRU186) describes the assembly language tools (assembler, linker, 
and other tools used to develop assembly language code), assembler 
directives, macros, common object file format, and symbolic debugging 
directives for the ’C6x generation of devices. 


TMS320C62xx CPU and Instruction Set Reference Guide (literature
number SPRU189) describes the ’C62xx CPU architecture, instruction
set, pipeline, and interrupts for the TMS320C62xx digital signal
processors.


TMS320C6x C Source Debugger User’s Guide (literature number 
SPRU188) tells you how to invoke the ’C6x simulator versions of the C 
source debugger interface. This book discusses various aspects of the 
debugger interface, including window management, command entry, 
code execution, data management, and breakpoints. 


TMS320 DSP Designer’s Notebook: Volume 1 (literature number
SPRT125) presents solutions to common design problems using ’C2x,
’C3x, ’C4x, ’C5x, and other TI DSPs.


TMS320C6x Optimizing C Compiler User’s Guide (literature number
SPRU187) describes the ’C6x C compiler. This C compiler accepts ANSI
standard C source code and produces assembly language source code
for the ’C6x generation of devices. This book also describes the
assembly optimizer, which helps you optimize your assembly code.


TMS320C62xx Peripherals Reference Guide (literature number SPRU190)
describes common peripherals available on the TMS320C62xx digital
signal processors. This book includes information on the internal data
and program memories, the external memory interface (EMIF), the host
port, serial ports, direct memory access (DMA), clocking and
phase-locked loop (PLL), and the power-down modes.


TMS320C62xx Programmer’s Guide (literature number SPRU198)
describes ways to optimize C and assembly code and includes
application program examples.


TMS320C6x Software Tools Getting Started Guide (literature number
SPRU185) describes how to install the TMS320C6x assembly language
tools, the C compiler, the simulator, and the C source debugger.
Installation instructions for SunOS™, Solaris™, Windows™ 95, and Windows
NT™ systems are given.


TMS320C6201 Digital Signal Processor Data Sheet (literature number
SPRS051) describes the features of the TMS320C6xx and provides
pinouts, electrical specifications, and timings for the device.
If You Need Assistance...


World-Wide Web Sites


TI Online http://www.ti.com
Semiconductor Product Information Center (PIC) http://www.ti.com/sc/docs/pic/home.htm
DSP Solutions http://www.ti.com/dsps
320 Hotline On-line™ http://www.ti.com/sc/docs/dsps/support.htm


North America, South America, Central America 


Product Information Center (PIC) (972) 644-5580 
TI Literature Response Center U.S.A. (800) 477-8924 
Software Registration/Upgrades (214) 638-0333 Fax: (214) 638-7742 

U.S.A. Factory Repair/Hardware Upgrades (281) 274-2285 

U.S. Technical Training Organization (972) 644-5580 

DSP Hotline (281) 274-2320 Fax: (281) 274-2324 Email: dsph@ti.com 
DSP Modem BBS (281) 274-2323 

DSP Internet BBS via anonymous ftp to ftp://ftp.ti.com/mirrors/tms320bbs 


Europe, Middle East, Africa 
European Product Information Center (EPIC) Hotlines:
Multi-Language Support +33 1 30 70 11 69 Fax: +33 1 30 70 10 32 Email: epic@ti.com
Deutsch +49 8161 80 33 11 or +33 1 30 70 11 68
English +33 1 30 70 11 65
Français +33 1 30 70 11 64
Italiano +33 1 30 70 11 67
EPIC Modem BBS +33 1 30 70 11 99
European Factory Repair +33 4 93 22 25 40
Europe Customer Training Helpline Fax: +49 81 61 80 40 10


Asia-Pacific 

Literature Response Center +852 2 956 7288 Fax: +852 2 956 2200
Hong Kong DSP Hotline +852 2 956 7268 Fax: +852 2 956 1002
Korea DSP Hotline +82 2 551 2804 Fax: +82 2 551 2828
Korea DSP Modem BBS +82 2 551 2914
Singapore DSP Hotline Fax: +65 390 7179
Taiwan DSP Hotline +886 2 377 1450 Fax: +886 2 377 2718
Taiwan DSP Modem BBS +886 2 376 2592

Taiwan DSP Internet BBS via anonymous ftp to ftp://dsp.ee.tit.edu.tw/pub/TI/ 


Japan 

Product Information Center +0120-81-0026 (in Japan) Fax: +0120-81-0036 (in Japan) 
+03-3457-0972 or (INTL) 813-3457-0972 Fax: +03-3457-1259 or (INTL) 813-3457-1259 

DSP Hotline +03-3769-8735 or (INTL) 813-3769-8735 Fax: +03-3457-7071 or (INTL) 813-3457-7071 

DSP BBS via Nifty-Serve Type “Go TIASP” 


Documentation 
When making suggestions or reporting errors in documentation, please include the following information that is on the title 
page: the full title of the book, the publication date, and the literature number. 
Mail: Texas Instruments Incorporated Email: comments@books.sc.ti.com 
Technical Documentation Services, MS 702 
P.O. Box 1443 
Houston, Texas 77251-1443 


Note: When calling a Literature Response Center to order documentation, please specify the literature number of the 
book. 
Classico, MicroLite, and Virtuoso Nano are trademarks of Eonic Systems, Inc.

EVP is a trademark of D2 Technologies. 

InvisiLink is a trademark of ViaDSP, Inc. 

PC is a trademark of International Business Machines Corporation. 

Solaris, SunOS, and Sun-3 are trademarks of Sun Microsystems, Inc. 

TI, cDSP, VelociTI, and XDS510 are trademarks of Texas Instruments Incorporated.


Windows, Windows 95, and Windows NT are registered trademarks of Microsoft Corporation. 
(Windows™, Windows™ 95, Windows NT™). 
Chapter 1


Introduction 


The TMS320C62xx devices feature VelociTI™, an advanced very long
instruction word (VLIW) architecture developed by Texas Instruments. VelociTI,
together with the development tool set and evaluation tools, provides faster
development time and higher performance with increased instruction-level
parallelism.
1.1 Introduction to the TMS320C6x Generation


With performance of up to 1600 million instructions per second (MIPS) and a
complete set of development tools, the ’C62xx devices offer cost-effective
solutions to high-performance DSP programming challenges. The ’C6x
development tools include a new C compiler, an assembly optimizer that
simplifies programming and scheduling, and a Windows-based debugger
interface. VelociTI combines advanced VLIW architecture with a high degree
of parallelism to produce a device that enables applications such as:

Unlimited Internet bandwidth
Universal wireless communications 
Radical new telephony features 
Remote medical diagnostics 
Ultimate automated cruise control 
Personal home base station 
Personalized home security 


The ’C62xx devices also can be used for improved performance on existing
applications, such as:


Wireless base stations 

Pooled modems and remote access servers 

Next-generation xDSL modems and cable modems 

Multichannel telephony platforms including central office switches, PBXs, 
and voice-messaging systems 

Multimedia systems 
1.2 The TMS320 Family of DSPs


The TMS320 family consists of both 16-bit fixed-point and 32-bit floating-point
devices. These DSPs possess the operational flexibility of high-speed
controllers and the numerical capability of array processors. The following
characteristics make this family the ideal choice for a wide range of processing
applications:


Very flexible instruction set
Inherent operational flexibility
High-speed performance
Innovative, parallel architectural design
Cost-effectiveness


1.2.1 History, Development, and Advantages of TMS320 DSPs


In 1982, Texas Instruments introduced the TMS32010, the first fixed-point
DSP in the TMS320 family. Before the end of the year, Electronic Products
magazine awarded the TMS32010 the title “Product of the Year”. The
TMS32010 became the model for future TMS320 generations.


Today, the TMS320 family consists of nine generations: the ’C1x, ’C2x, ’C2xx,
’C5x, and ’C54x are fixed-point, the ’C3x and ’C4x are floating-point, the ’C8x
is a multiprocessor, and the ’C6x will offer both fixed-point and floating-point
devices. The first device in the ’C6x generation is the TMS320C6201, which
is a fixed-point DSP.


Each generation of TMS320 devices has a central processing unit (CPU) and
a variety of on-chip memory and peripheral configurations. These spin-off
devices satisfy a wide range of needs in the worldwide electronics market. When
memory and peripherals are integrated into one processor, the overall system
cost is greatly reduced, and circuit board space is saved. Figure 1-1 shows
the progress of the TMS320 family of devices.
Figure 1-1. The TMS320 Family of DSPs 
1.2.2 Typical Applications


The TMS320 family of DSPs offers better, more adaptable approaches to
traditional signal-processing problems, such as vocoding, filtering, and error
coding. Furthermore, the TMS320 family supports complex applications that often
require multiple operations to be performed simultaneously. Table 1-1 shows
many of the typical applications of the TMS320 family.


Table 1-1. Typical Applications for the TMS320 Family 


Automotive 


Adaptive ride control 
Antiskid brakes 
Cellular telephones 
Digital radios 
Engine control 
Global positioning 
Navigation 

Vibration analysis 
Voice commands 


General-Purpose 


Adaptive filtering 
Convolution 

Correlation 

Digital filtering 

Fast Fourier transforms 
Hilbert transforms 
Waveform generation 
Windowing 


Instrumentation 


Digital filtering 
Function generation 
Pattern matching 
Phase-locked loops 
Seismic processing 
Spectrum analysis 
Transient analysis 


Consumer 


Digital radios/TVs 

Educational toys 

Music synthesizers 

Pagers 

Power tools 

Radar detectors 

Solid-state answering machines 


Graphics/Imaging 


3-D rotation 

Animation/digital maps 
Homomorphic processing 

Image compression/transmission 
Image enhancement 

Pattern recognition 

Robot vision 

Workstations 


Medical 


Diagnostic equipment 
Fetal monitoring 
Hearing aids 

Patient monitoring 
Prosthetics 
Ultrasound equipment 


Telecommunications 


1200- to 56 600-bps modems 
Adaptive equalizers 

ADPCM transcoders 

Base Stations 

Cellular telephones 

Channel multiplexing 

Data encryption 

Digital PBXs 

Digital speech interpolation (DSI) 
DTMF encoding/decoding 
Echo cancellation 


Faxing 

Future Terminals 

Line repeaters 

Personal communications systems (PCS)

Personal digital assistants (PDA) 

Speaker phones 


Spread spectrum communications 


xDSL 
Video conferencing 
X.25 packet switching 


Control 


Disk drive control 
Engine control 
Laser printer control 
Motor control 
Robotics control 
Servo control 


Industrial 


Numeric control 
Power-line monitoring 
Robotics 

Security access 


Military 


Image processing 

Missile guidance 
Navigation 

Radar processing 

Radio frequency modems 
Secure communications 
Sonar processing 


Voice/Speech 


Speaker verification 
Speech enhancement 
Speech recognition 
Speech synthesis 
Speech vocoding 
Text-to-speech 

Voice mail 
1.3 Key Features of the TMS320C62xx Devices


The TMS320C62xx devices are fixed-point processors based on the
advanced VLIW CPU with eight functional units, including two multipliers and six
arithmetic logic units. The CPU can execute up to eight instructions per cycle
for up to ten times the performance of typical DSPs. The advanced VLIW
architecture allows designers to develop highly effective reduced instruction-set
computer (RISC)-like code for fast development time. Features common to
all the devices in the ’C62xx series are listed in Table 1-2.


Table 1-2. Key Features of the TMS320C62xx Devices


Feature: Advanced VLIW CPU with eight functional units including two
         multipliers and six arithmetic logic units
Benefit: Executes up to eight instructions per cycle for up to ten times the
         performance of typical DSPs.
         Allows designers to develop highly effective reduced instruction-set
         computer (RISC)-like code for fast development time.

Feature: Instruction packing
Benefit: Code size equivalence for eight instructions executed serially or in
         parallel.
         Reduces code size, program fetches, and power consumption.

Feature: 100% conditional instructions
Benefit: Reduces costly branching.
         Increases parallelism for higher sustained performance.

Feature: Code executes as programmed on highly independent functional units
Benefit: The most efficient C compiler in the industry on DSP benchmark
         suite and industry’s first assembly optimizer for fast development
         time.

Feature: 8-/16-/32-bit data support
Benefit: Efficient memory support for a variety of applications.

Feature: 40-bit arithmetic options
Benefit: Extra precision for vocoders and other computationally intensive
         applications.

Feature: Saturation and normalization
Benefit: Support for key arithmetic operations.

Feature: Bit-field manipulation and instruction: extract, set, clear, bit
         counting
Benefit: Supports common operations found in control and data manipulation
         applications.
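
As a rough illustration of two of the features listed in Table 1-2, the following C sketch models a 32-bit saturating add and an unsigned bit-field extract in portable C. The function names sat_add32 and extract_u32 are hypothetical and do not come from the ’C6x tool set. The sketch mirrors only the arithmetic behavior named in the table, not the instruction encodings or operand formats of the device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add two 32-bit values and clamp the result to the
   signed 32-bit range instead of wrapping around on overflow. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;   /* exact sum in 64 bits */
    if (sum > INT32_MAX) return INT32_MAX;   /* clamp positive overflow */
    if (sum < INT32_MIN) return INT32_MIN;   /* clamp negative overflow */
    return (int32_t)sum;
}

/* Hypothetical helper: extract `width` bits starting at bit `lsb`
   (bit 0 is the least significant bit) as an unsigned value. */
uint32_t extract_u32(uint32_t value, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    uint32_t shifted = value >> lsb;
    if (width == 32) return shifted;         /* avoid a shift by 32 */
    return shifted & ((1u << width) - 1u);
}
```

For example, sat_add32(INT32_MAX, 1) stays at INT32_MAX rather than wrapping to a negative value, which is the property that makes saturation useful in filter accumulations.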


The first device in the family is the TMS320C6201. The early release of this 
device includes memory, the external memory interface (EMIF), direct 
memory access (DMA) with two channels, the host-port interface (HPI), and 
a flexible phase-locked loop (PLL) clock generator. The production version of 
this device also will have two enhanced-buffered serial ports and two 32-bit 
timers. 


Table 1-3 summarizes the key features of the TMS320C6201 device. 


Table 1-3. Features of the TMS320C6201


Feature: Bit-field manipulation and instruction: extract, set, clear, bit
         counting
Benefit: Supports common operations found in control and data manipulation
         applications.

Feature: 1M-bit on-chip memory (512K-bit program, 512K-bit data)
Benefit: Fast algorithm execution with fewer components per system.

Feature: 32-bit external memory interface supports synchronous dynamic
         random access memory (SDRAM), synchronous burst static RAM
         (SBSRAM), and static RAM (SRAM)
Benefit: High-speed connections to external memory for maximum sustained
         performance.

Feature: Two enhanced-buffered serial ports (EBSPs)
Benefit: Glueless interface to high-bandwidth telecommunications trunks.
         Provides high-speed interprocessor communication.

Feature: 16-bit host access port
Benefit: Host processor access to on-chip data memory.

Feature: Two direct memory access (DMA) channels with boot loading capability
Benefit: Efficient access to external memory/peripherals while minimizing
         CPU interrupts.

Feature: Flexible PLL clock generator
Benefit: Multiplies external clock rate by two or four for maximum CPU
         performance.

Feature: 352-lead ball grid array package
Benefit: Ultra-thin package minimizes board space.
Chapter 2


CPU Architecture


The VelociTI architecture makes the ’C62xx the first off-the-shelf DSP to use
advanced VLIW to achieve high performance through increased instruction-level
parallelism. A traditional VLIW architecture consists of multiple execution
units running in parallel that perform multiple instructions during a single clock
cycle. Parallelism is the key to extremely high performance, taking these
next-generation DSPs well beyond the performance capabilities of traditional
superscalar designs. VelociTI is a highly deterministic architecture, with few
restrictions on how or when instructions are fetched, executed, or stored. This
architectural flexibility is key to the break-through efficiency levels of the ’C6x
compiler.
2.1 TMS320C62xx Block Diagram


The ’C62xx processor consists of three main parts: the CPU (or the “core”),
peripherals, and memory. The first device in the series, the TMS320C6201,
is a fixed-point DSP using the VelociTI VLIW architecture. Eight functional
units operate in parallel, with two identical sets of the basic four functional
units. The units communicate through two register files, which each contain
16 32-bit registers. Program parallelism is defined at compile time since there
is no data dependency checking done in hardware during run time. The
256-bit-wide program memory fetches eight 32-bit instructions every single
cycle.


Figure 2-1 shows the block diagram for the TMS320C6201 digital signal
processor (DSP). ’C62xx DSPs are based on the ’C62xx CPU. ’C62xx devices
come with program memory, which on some devices can be used as a program
cache. The devices also have varying sizes of data memory. Peripherals such
as a DMA controller, power-down logic, and EMIF usually come with the CPU,
and peripherals such as serial ports and timers are available on certain
devices. Check the data sheet for your device to determine the specific peripheral
configurations you have.


Figure 2-1. TMS320C6201 CPU Core With Peripherals 
2.2 Central Processing Unit (CPU)


The ’C62xx central processing unit (CPU) is the central building block of all the 
TMS320C62xx devices. The CPU contains: 


Program fetch unit 

Instruction dispatch unit 

Instruction decode unit 

32 registers 

Two data paths, each with four functional units 
Control registers 

Control logic 

Test, emulation, and interrupt logic 


The CPU has two data paths where processing occurs. Each data path has
four functional units (.L, .S, .M, .D) and a register file containing 16 32-bit 
registers. The functional units execute logic, shifting, multiply, and data 
address operations. All instructions operate on the registers. The two sets of 
data-addressing units (.D1 and .D2) are exclusively responsible for all data 
transfers between the register files and the memory. 


The four functional units on each side of the CPU share the control register
files. Each side also has a single data bus connected to registers on the other
side of the CPU so that the units can cross-exchange data from register files
on opposite sides. Register access across the CPU supports only one read
and write operation per cycle.


The two sets of functional units include the following: 


Two multipliers
Six arithmetic logic units (ALUs)
32 registers with 32-bit word length each


Each functional unit is controlled by a 32-bit instruction. The instruction fetch,
instruction dispatch, and instruction decode blocks can deliver up to eight
32-bit instructions from the program memory to the functional units every
cycle. The control register file provides methods to configure and control
various aspects of processor operation.


The VLIW processing flow begins when a 256-bit-wide instruction fetch packet
is fetched from the internal program memory. The instructions are linked
together by the least significant bit (LSB) positions of the instructions. The
instructions linked together for simultaneous execution (up to eight in total)
comprise an execute packet. For more details on the processing, see the
TMS320C6201 Digital Signal Processor data sheet (literature number
SPRS051).
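
The grouping of one fetch packet into execute packets can be sketched in C as follows. The sketch assumes that an LSB of 1 in an instruction means that the next instruction in the fetch packet executes in parallel with it, which is our reading of the linking rule rather than a statement made in this manual. The function name split_execute_packets and the array layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FETCH_PACKET_WORDS 8   /* 256 bits / 32 bits per instruction */

/* Hypothetical helper: given the eight 32-bit words of one fetch packet,
   write the size of each execute packet into sizes[] and return the
   number of execute packets found.
   Assumption: LSB = 1 links an instruction to the one that follows it. */
size_t split_execute_packets(const uint32_t words[FETCH_PACKET_WORDS],
                             size_t sizes[FETCH_PACKET_WORDS])
{
    size_t count = 0;     /* execute packets completed so far */
    size_t current = 0;   /* instructions in the packet being built */

    for (size_t i = 0; i < FETCH_PACKET_WORDS; i++) {
        int linked_to_next = (int)(words[i] & 1u);
        current++;
        /* In this sketch the last word always closes a packet. */
        if (!linked_to_next || i == FETCH_PACKET_WORDS - 1) {
            sizes[count++] = current;
            current = 0;
        }
    }
    return count;
}
```

With all LSBs clear, the eight instructions form eight execute packets of one instruction each. With the first seven LSBs set, they form a single execute packet of eight instructions.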


The program fetch, instruction dispatch, and instruction decode units can
deliver up to eight 32-bit instructions from the program memory to the functional
units every cycle. Processing occurs in each of the two data paths (A and B).
Each data path has four functional units (.L, .S, .M, and .D) and a register file
containing 16 32-bit registers. Each functional unit is controlled by a 32-bit
instruction. To understand how instructions are fetched, dispatched, decoded,
and executed in the data path, refer to the chapter on pipeline operation in the
TMS320C62xx CPU and Instruction Set Reference Guide (literature number
SPRU189).


2.3 CPU Data Paths 


Figure 2-2 shows the ’C62xx CPU data paths, which consist of:


Two general-purpose register files (A and B)
Eight functional units (.L1, .L2, .S1, .S2, .M1, .M2, .D1, and .D2)
Two load-from-memory paths (LD1 and LD2)
Two store-to-memory paths (ST1 and ST2)
Two register file cross paths (1X and 2X)


2.3.1 General-Purpose Register Files 


There are two general-purpose register files (A and B) in the ’C62xx data
paths. Each of these files contains 16 32-bit registers (labeled A0-A15 for file
A and B0-B15 for file B). The general-purpose registers can be used for data
or data-address pointers. Registers A1, A2, B0, B1, and B2 can be used for
condition registers.


2.3.2 Functional Units 


The eight functional units in the ’C62xx data paths can be divided into two
groups of four, each of which is virtually identical for each register file. The
functional units are described in Table 2-1.


Most data lines in the CPU support 32-bit operands, and some support long
(40-bit) operands. Each functional unit has its own 32-bit write port into a
general-purpose register file. All units ending in 1 (for example, .L1) write to
register file A and all units ending in 2 write to register file B. Each functional unit
has two 32-bit read ports for source operands src1 and src2. Four units (.L1,
.L2, .S1, .S2) have an extra 8-bit wide port for 40-bit long writes as well as an
8-bit input for 40-bit long reads. Because each unit has its own 32-bit write port,
all eight units can be used in parallel every cycle.


Table 2-1. Functional Units and Descriptions


Functional Unit       Description

.L Unit (.L1, .L2)    32/40-bit arithmetic and compare operations
                      Leftmost 1 or 0 bit counting for 32 bits
                      Normalization count for 32 and 40 bits
                      32-bit logical operations

.S Unit (.S1, .S2)    32-bit arithmetic operations
                      32/40-bit shifts and 32-bit bit-field operations
                      32-bit logical operations
                      Branching
                      Constant generation
                      Register transfers to/from the control register file

.M Unit (.M1, .M2)    16 x 16-bit multiplies

.D Unit (.D1, .D2)    32-bit add, subtract, linear and circular address
                      calculation
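
The leftmost-bit counting and normalization count listed for the .L unit can be modeled in portable C, as a minimal sketch. The function names are hypothetical, and the corner cases (for example, returning 32 when no matching bit exists) are assumptions of this sketch rather than behavior stated in this manual. See the CPU and Instruction Set Reference Guide for the exact instruction definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of leftmost-bit detection: count the bits to the left
   of the first bit equal to `bit` (0 or 1), scanning from bit 31 downward.
   Returns 32 if no such bit exists (assumed corner-case behavior). */
unsigned leftmost_bit_count(unsigned bit, uint32_t value)
{
    assert(bit <= 1u);
    for (unsigned n = 0; n < 32; n++) {
        unsigned current = (value >> (31u - n)) & 1u;
        if (current == bit) return n;
    }
    return 32;
}

/* Hypothetical model of a 32-bit normalization count: the number of
   redundant sign bits, that is, how far the value can be shifted left
   without changing its sign. */
unsigned norm_count32(int32_t value)
{
    uint32_t bits = (uint32_t)value;
    unsigned sign = bits >> 31;
    /* Bits to the left of the first bit that differs from the sign bit,
       minus the sign bit itself. */
    return leftmost_bit_count(sign ^ 1u, bits) - 1u;
}
```

A normalization count of this kind is typically used to scale a fixed-point value so that it occupies the full 32-bit range before further processing.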


2.3.3 Register File Cross Paths 


Each general-purpose register file is connected to the opposite register file’s
functional units by the 1X and 2X paths. These paths allow the .S, .M, and .L
units from each side to access operands from either file.


Four units (.M1, .M2, .S1, .S2) have one 32-bit input mux selectable with either
the same side register file (A for units ending in a 1 and B for units ending in
a 2), or the opposite file via the cross paths (1X and 2X). The 32-bit inputs on
the .L1 and .L2 units are both mux selectable via the cross paths.


2.3.4 Memory, Load, and Store Paths 


There are two 32-bit paths for loading data from memory to the register file: 
one (LD1) for register file A, and one (LD2) for register file B. There are also 
two 32-bit paths, ST1 and ST2, for storing register values to memory from each 
register file. The store paths are shared with the .L and .S long read paths. 


2.3.5 Data-Address Paths 


The data-address paths (DA1 and DA2) coming out of the .D units allow data 
addresses generated from one register file to support loads and stores to 
memory from the other register file. 


Figure 2-2. TMS320C62xx CPU Data Paths


2.4 Mapping Between Instructions and Functional Units


Table 2-2 and Table 2-3 define the mapping between instructions and
functional units. The first table lists the instructions that can be used on each
functional unit. The second table lists the instructions alphabetically with the
functional unit where each instruction can be used checked under the units.


Table 2-2. Instruction to Functional Unit Mapping


.L Unit    .M Unit    .S Unit    .D Unit
ABS        MPY        ADD        ADD
ADD        SMPY       ADDK       ADDA
AND                   ADD2       LD mem
CMPEQ                 AND        LD mem (15-bit offset)‡
CMPGT                 B disp     MV
CMPGTU                B IRP†     NEG
CMPLT                 B NRP†     ST mem
CMPLTU                B reg      ST mem (15-bit offset)‡
LMBD                  CLR        SUB
MV                    EXT        SUBA
NEG                   EXTU       ZERO
NORM                  MVC†
NOT                   MV
OR                    MVK
SADD                  MVKH
SAT                   NEG
SSUB                  NOT
SUB                   OR
SUBC                  SET
XOR                   SHL
ZERO                  SHR
                      SHRU
                      SSHL
                      STP†
                      SUB
                      SUB2
                      XOR
                      ZERO

† .S2 only
‡ .D2 only


Table 2-3. Functional Unit to Instruction Mapping


                              ’C62xx Functional Units
Instruction                .L Unit   .M Unit   .S Unit   .D Unit
ABS                           X
ADD                           X                   X         X
ADDA                                                        X
ADDK                                              X
ADD2                                              X
AND                           X                   X
B disp                                            X
B IRP                                             X†
B NRP                                             X†
B reg                                             X
CLR                                               X
CMPEQ                         X
CMPGT                         X
CMPGTU                        X
CMPLT                         X
CMPLTU                        X
EXT                                               X
EXTU                                              X
IDLE
LD mem                                                      X
LD mem (15-bit offset)                                      X‡
LMBD                          X
MPY                                     X
MVC                                               X†
MV                            X                   X         X
MVK                                               X
MVKH                                              X
NEG                           X                   X         X
NOP
NORM                          X
NOT                           X                   X
OR                            X                   X
SADD                          X
SAT                           X
SET                                               X
SHL                                               X
SHR                                               X
SHRU                                              X
SMPY                                    X
SSHL                                              X
SSUB                          X
ST mem                                                      X
ST mem (15-bit offset)                                      X‡
STP                                               X†
SUB                           X                   X         X
SUBA                                                        X
SUBC                          X
SUB2                                              X
SWI
XOR                           X                   X
ZERO                          X                   X         X

† .S2 only
‡ .D2 only


2.5 Addressing Modes


The addressing mode options on the ’C62xx are linear, circular using BK0, and
circular using BK1. The mode is specified by the addressing-mode register
(AMR).


Eight registers can perform circular addressing. A4-A7 are used by the .D1 unit
and B4-B7 are used by the .D2 unit. No other units can perform circular
addressing. For each of these registers, the AMR specifies the addressing
mode.


LD(B)(H)(W), ST(B)(H)(W), ADDA(B)(H)(W), and SUBA(B)(H)(W) instructions
all use the AMR to determine what type of address calculations are
performed for these registers. All registers can perform linear mode addressing.


For more information on addressing modes, see the TMS320C62xx CPU and
Instruction Set Reference Guide (literature number SPRU189).


2.6 Interrupts


The ’C6200 CPU has 14 interrupts. These are reset, the non-maskable
interrupt (NMI), and interrupts 4-15. These interrupts correspond to the RESET,
NMI, and INT4-INT15 signals on the CPU boundary. In some ’C62xx devices
these signals may be tied directly to pins on the device, connected to on-chip
peripherals, or may be disabled permanently by being tied inactive on-chip.
Generally, RESET and NMI are connected directly to pins on the device.


For more information on interrupts, see the TMS320C62xx CPU and
Instruction Set Reference Guide (literature number SPRU189).
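

As a supplement to the addressing modes described in Section 2.5, the difference between a linear and a circular address update can be sketched in C. The sketch assumes that a circular buffer occupies a power-of-two number of bytes and is aligned to its own size, so that only the low address bits change on an update. That assumption, the function names, and the block_bytes parameter are illustrative and are not taken from this manual. On the device, the block size is selected through the BK0 or BK1 setting and the mode through the AMR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a circular address update.
   Assumption: the buffer is a power-of-two number of bytes and is aligned
   to its size, so the upper address bits stay fixed while the low bits
   wrap around. */
uint32_t circ_add(uint32_t addr, int32_t offset_bytes, uint32_t block_bytes)
{
    /* block_bytes must be a nonzero power of two */
    assert(block_bytes != 0 && (block_bytes & (block_bytes - 1u)) == 0);
    uint32_t mask = block_bytes - 1u;
    uint32_t base = addr & ~mask;                            /* fixed bits */
    uint32_t low  = (addr + (uint32_t)offset_bytes) & mask;  /* wraps */
    return base | low;
}

/* Linear mode for comparison: a plain add with no wraparound. */
uint32_t linear_add(uint32_t addr, int32_t offset_bytes)
{
    return addr + (uint32_t)offset_bytes;
}
```

Stepping a pointer through a 64-byte delay line with circ_add keeps it inside the buffer without an explicit bounds test, which is the usual reason for using circular addressing in filter loops.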


Chapter 3 


Memory 


The TMS320C62xx devices come with on-chip memory that can be selected
for use as program memory or program cache. The device is available with
varying sizes of data memory. When off-chip memory is used, the external
memory interface (EMIF) can unify these spaces to a single memory space on
most devices.


3.1 Memory Map


Figure 3-1 shows the memory map of the TMS320C6201 DSP. The total
memory address range of the ’C6201 is 4G bytes (corresponding to a 32-bit
internal address representation). The memory map is divided between the
internal-program memory, internal-data memory, three external memory
spaces, and internal-peripheral space.


Figure 3-1. Memory Map of the TMS320C6201


Memory Map 0

Starting Address   Contents                      Block Size (Bytes)
0000 0000          External-Memory Space CE0     16M
0100 0000          External-Memory Space CE1     4M
0140 0000          Internal-Program RAM          64K
0141 0000          Reserved                      (rest of 4M block)
0180 0000          Internal-Peripheral Space     4M
01C0 0000          Reserved                      4M
0200 0000          External-Memory Space CE2     32M
0400 0000          Reserved                      1984M
8000 0000          Internal-Data RAM             64K
8001 0000          Reserved                      (rest of 4M block)
8040 0000          Reserved                      2044M
1 0000 0000        (end of address range)


Memory Map 1

Starting Address   Contents                      Block Size (Bytes)
0000 0000          Internal-Program RAM          64K
0001 0000          Reserved                      (rest of 4M block)
0040 0000          External-Memory Space CE0     16M
0140 0000          External-Memory Space CE1     4M
0180 0000          Same as Memory Map 0
1 0000 0000        (end of address range)

3.2 Internal Memory


The internal (on-chip) memory is organized into separate data and program 
spaces. The ’C62xx has two internal ports to access data memory, each with 
a 32-bit data and 32-bit byte-address reach. It has a single port to program 
memory, with an instruction fetch width of 256 bits and a 30-bit word (four byte) 
address, equivalent to a 32-bit byte address. 


3.2.1 Data-Memory System


The TMS320C6201 data-memory system includes 64K bytes of SRAM and
a memory controller. The TMS320C6201 CPU can access data memory in
8-bit byte, 16-bit halfword, and 32-bit word lengths. The data memory system
supports two memory accesses in a cycle. These accesses can be any
combination of loads and stores from the two data buses of the CPU. Similarly, a
simultaneous internal and external memory access is supported by the data
memory system. The TMS320C6201 data memory system also supports
direct memory access (DMA) and external host accesses. For more information
on the DMA operation, see the TMS320C62xx Peripherals Reference Guide
(literature number SPRU190).


The data memory is organized into four banks of 16-bit wide memory. This
interleaved memory organization provides a method for two simultaneous
memory accesses. Two simultaneous accesses to two different internal
memory banks occur in one cycle and provide the fastest access speed.
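

Whether two accesses fall in different banks can be estimated with a short C sketch. It assumes that the interleaving places consecutive 16-bit halfwords in consecutive banks, which is the usual meaning of interleaved memory but is not spelled out in this manual. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BANKS        4u   /* four banks, per the text */
#define BANK_WIDTH_BYTES 2u   /* each bank is 16 bits wide */

/* Assumed interleaving: consecutive halfwords map to consecutive banks. */
unsigned bank_of(uint32_t byte_addr)
{
    return (byte_addr / BANK_WIDTH_BYTES) % NUM_BANKS;
}

/* Bit mask of the banks touched by an aligned access of 1, 2, or 4 bytes. */
unsigned banks_touched(uint32_t byte_addr, unsigned size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2 || size_bytes == 4);
    assert(byte_addr % size_bytes == 0);
    unsigned mask = 1u << bank_of(byte_addr);
    if (size_bytes == 4) {
        /* A 32-bit word spans two adjacent 16-bit banks. */
        mask |= 1u << bank_of(byte_addr + 2u);
    }
    return mask;
}

/* Nonzero if two same-cycle accesses would use at least one common bank. */
int accesses_share_bank(uint32_t a, unsigned a_size,
                        uint32_t b, unsigned b_size)
{
    return (banks_touched(a, a_size) & banks_touched(b, b_size)) != 0;
}
```

Under these assumptions, two word accesses at byte addresses 0 and 4 use different banks, while word accesses at byte addresses 0 and 8 use the same banks.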


3.2.2 Program-Memory System 


The TMS320C6201 program-memory system includes 64K bytes of on-chip
SRAM and a memory/cache controller. The program memory can operate
either as a 64K-byte internal program memory or as a direct-mapped program
cache. The TMS320C6201 program-memory system operates in one of four
modes:

- Program-memory mode
- Cache-enable mode
- Cache-freeze mode
- Cache-bypass mode

The DMA can write data into an addressed space of program memory. In
program-memory mode, the DMA cannot read from the internal program
memory.


When the program memory is used to cache external program data, the
memory is no longer in valid memory space and cannot be directly addressed;
therefore, the DMA cannot write or read the internal program memory in any
cache mode. The caching scheme implemented in the TMS320C6201 program
cache is a direct mapping of external program-memory addresses. This
means that any external address maps to only one cache location, and
addresses that are 64K bytes apart map to the same cache location. The
program cache is organized into 256-bit frames, so each frame holds one fetch
packet. The cache stores 2048 fetch packets.
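
The direct mapping described above can be sketched as simple address arithmetic. The frame size and cache size come from the text; the function names and the notion of a "tag" (the part of the address that distinguishes addresses sharing a frame) are illustrative assumptions, not terms used in this manual.

```python
CACHE_BYTES = 64 * 1024                  # 64K bytes of program SRAM
FRAME_BYTES = 256 // 8                   # one 256-bit frame = 32 bytes
NUM_FRAMES = CACHE_BYTES // FRAME_BYTES  # 2048 fetch packets


def cache_frame(byte_address: int) -> int:
    """Frame index that an external program address maps to."""
    return (byte_address // FRAME_BYTES) % NUM_FRAMES


def cache_tag(byte_address: int) -> int:
    """Hypothetical tag: which 64K-byte block of external memory is cached."""
    return byte_address // CACHE_BYTES
```

Two addresses that are 64K bytes apart return the same `cache_frame` value but different `cache_tag` values, which is why loading one evicts the other.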


A program store to external memory in any cache mode first flushes the data
in the cache frame that is directly mapped to the target address, to ensure
data coherency in the cache. The data is then written to the external memory
at the addressed location. When that address is accessed again, a cache miss
occurs, causing the new data to be loaded from external memory.


On the change from program-memory mode to cache-enabled mode, the
program cache is flushed. During a cache freeze, the cache retains its current
state. A program read to a frozen cache is identical to a read to an enabled
cache, except that the data read from the external interface is not
stored in the cache on a cache miss. When the cache is bypassed, any
program read fetches data from external memory. The data is not stored in the
cache memory. As in cache freeze, the cache retains its state in cache bypass.
For details on cache modes, see the TMS320C62xx Peripherals Reference
Guide.
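
The difference between the enabled, freeze, and bypass behaviors can be illustrated with a small software model. This is a sketch under stated assumptions: the class, the mode strings, and the return values are hypothetical, and the model tracks only which external block each frame holds, not the data itself.

```python
class ProgramCacheModel:
    """Toy model of a direct-mapped program cache with three cache modes."""

    FRAME_BYTES = 32    # one 256-bit fetch packet
    NUM_FRAMES = 2048   # 64K bytes / 32 bytes

    def __init__(self, mode: str = "enabled"):
        self.mode = mode     # "enabled", "freeze", or "bypass" (labels assumed)
        self.frames = {}     # frame index -> tag of the block held there

    def _locate(self, byte_address: int) -> tuple:
        block = byte_address // self.FRAME_BYTES
        return block % self.NUM_FRAMES, block // self.NUM_FRAMES

    def read(self, byte_address: int) -> str:
        """Return "hit", "miss", or "external" for a program read."""
        if self.mode == "bypass":
            return "external"        # always fetched externally, state kept
        index, tag = self._locate(byte_address)
        if self.frames.get(index) == tag:
            return "hit"
        if self.mode == "enabled":
            self.frames[index] = tag  # only an enabled cache stores on a miss
        return "miss"
```

In this model a frozen cache keeps serving hits for frames loaded earlier but never replaces them, and a bypassed cache leaves its contents untouched until it is enabled again.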
3.3 External Memory Interface (EMIF)


All external data accesses by the CPU or DMA pass through the external
memory interface (EMIF). The EMIF is the interface between the CPU and
external memory such as synchronous dynamic random-access memory
(SDRAM), synchronous-burst static RAM (SBSRAM), and asynchronous
memory. The EMIF also provides 8-bit and 16-bit wide memory read capability
to support low-cost boot ROM memories (flash, EEPROM, EPROM, and
PROM). The production version of the EMIF will support higher-throughput
interfaces to SDRAM, including burst capability.


The interface is programmable to adapt to a variety of setup, hold, and strobe
widths for asynchronous devices. SBSRAM supports zero-wait-state external
access once bursts have begun.


In all of these types of access, the EMIF supports 8-bit, 16-bit, and 32-bit
addressability for writes. All reads are performed as 32-bit transfers.


The EMIF can receive three types of requests for access. The three types are
prioritized in this order: CPU data accesses, CPU program fetches, and DMA
data accesses. When available to service another access, the EMIF services
the request type of highest priority. For example, DMA requests are not
serviced until the CPU ceases requesting external data and program fetches.
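
The fixed priority order can be expressed as a short selection routine. The request labels and the function name below are hypothetical; only the ordering is taken from the text.

```python
# Highest priority first, as listed in the text.
EMIF_PRIORITY = ("cpu_data", "cpu_program", "dma")


def next_request(pending: set):
    """Pick the pending request type that the EMIF would service next."""
    for kind in EMIF_PRIORITY:
        if kind in pending:
            return kind
    return None  # nothing is waiting
```

With both a DMA request and a CPU program fetch pending, the routine returns the program fetch; the DMA request is selected only when it is the sole pending request.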


The major functions implemented by the EMIF are the following:

- Steering incoming data bytes and half-words to form word data (when
  reading from byte and half-word memories)
- Interfacing to the ’C6xx internal peripheral bus
- Interfacing to SBSRAM and SRAM, including computation of the next
  address
- Interfacing to asynchronous memories, using programmable timing
- Interfacing to SDRAM memories
- Handshaking with internal and external modules


The characteristics of the EMIF are as follows:

- Zero-wait-state operation, after an initial two-clock-cycle latency, with
  synchronous burst SRAM (currently available in 125-MHz speed grades)
- Support for little or no glue-logic interface to asynchronous memory. Timing
  parameters can be programmed to match various asynchronous memories.
- Serialization between the program-memory system, data-memory system,
  and DMA system
- Support for sharing external memory with another processor
- Support for reading byte and half-word memory devices (ROM) to facilitate
  boot up from these low-cost devices


The exact level of throughput to the ’C62xx is determined by the type of
memory used and the clock rate of the ’C62xx. For example, when the ’C62xx
is running at 200 MHz, the maximum throughput is 800M bytes/second,
assuming the memory supports that throughput.
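
The 800M bytes/second figure follows from one 32-bit (4-byte) transfer per clock cycle at 200 MHz. The helper below only reproduces that arithmetic; the one-transfer-per-cycle rate is an assumption consistent with the stated figure, and the function name is hypothetical.

```python
def peak_throughput_mbytes_per_s(clock_mhz: float, bus_width_bits: int = 32) -> float:
    """Peak rate assuming one full-width transfer on every clock cycle."""
    return clock_mhz * (bus_width_bits / 8)
```

Slower memories, wait states, or narrower transfers lower the achieved rate below this peak.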


Figure 3-2 shows a diagram of the ’C62xx external memory signals that are
common to all interfaces.


Figure 3-2. External Memory Interface (EMIF) Block Diagram


For more information on memory, see the TMS320C62xx CPU and Instruction
Set Reference Guide (literature number SPRU189). For more information on
the EMIF, see the TMS320C62xx Peripherals Reference Guide (literature
number SPRU190).
Chapter 4


Peripherals


In addition to on-chip memory, the TMS320C62xx devices also contain several
peripherals for communication with off-chip memory, co-processors, and
serial devices.


Peripherals for the TMS320C62xx devices may include:

- External memory interface (EMIF)
- Direct-memory access (DMA) controller
- Host-port interface (HPI)
- Power-down logic
- Enhanced-buffered serial ports (EBSPs)
- 32-bit timers


The first device in the TMS320C62xx series is the TMS320C6201. The
production release of this device will include the EBSPs supporting multivendor
interface protocol (MVIP) and timers to allow easy algorithm integration. The
EBSP is based on the standard TMS320C2x/C5x/C54x serial port. In addition,
it has the ability to buffer serial samples in memory automatically with the aid
of the DMA. It also has multichannel capability compatible with the T1, E1, and
MVIP standards.


Figure 4-1 shows the peripherals for the TMS320C62xx devices.


Figure 4-1. Peripherals Overview
4.1 External Memory Interface (EMIF)


The EMIF is the interface between the CPU and external memory such as
synchronous dynamic random-access memory (SDRAM), synchronous burst
static RAM (SBSRAM), and asynchronous memory. The EMIF also provides
8-bit and 16-bit wide memory read capability to support low-cost boot ROM
memories (flash, EEPROM, EPROM, and PROM). The final revision of the
EMIF will support higher-throughput interfaces to SDRAM, including burst
capability.


More information on the EMIF is provided in Chapter 3, Memory.


For more information on memory, see the TMS320C62xx CPU and Instruction
Set User’s Guide. For more information on the EMIF, see the TMS320C62xx
Peripherals Reference Guide.



4.2 Direct-Memory Access (DMA) 


The on-chip DMA offers two channels that can be configured to transfer
information from one location in the memory map to another without interfering
with the operation of the CPU. This allows interfacing to slow external
memories and peripherals without reducing the throughput to the CPU. The DMA
controller contains its own address generators, source and destination
registers, and transfer counter. The DMA has its own bus for addresses and data.
This keeps the data transfers between memory and peripherals from
conflicting with the CPU. A DMA operation consists of a 32-bit word transfer to
or from any of the three ’C62xx modules (see Figure 4-2):

- Internal data memory
- Internal program memory that is not configured as cache, as a destination
  of a transfer
- EMIF


One of the channels can be used by the processor during the boot-load startup
procedure to initialize the internal program memory after reset. The DMA
channels can be used to write to internal program memory.


The boot loader uses the DMA to boot load code from off-chip memory to the
internal program memory area. An external pin (sampled at reset) selects
whether this boot load is performed. The serial port can also be used for
booting.


The DMA controller can access all internal program memory, all internal data
memory, and all devices mapped to the EMIF. An exception is that the DMA
cannot use program memory as the source of a transfer. Also, it cannot access
memories configured as cache or memory-mapped on-chip peripheral
registers.
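
These access rules can be summarized as a small validity check. The region labels and the function name are hypothetical; the rules themselves restate the paragraph above.

```python
DMA_REGIONS = {"internal_data", "internal_program", "emif"}


def dma_transfer_allowed(source: str, destination: str,
                         program_memory_is_cache: bool = False) -> bool:
    """Return True when the text permits a DMA transfer between two regions."""
    if source not in DMA_REGIONS or destination not in DMA_REGIONS:
        return False   # e.g. memory-mapped on-chip peripheral registers
    if source == "internal_program":
        return False   # program memory cannot be the source of a transfer
    if destination == "internal_program" and program_memory_is_cache:
        return False   # memory configured as cache is not accessible
    return True
```

For example, a transfer from the EMIF to internal program memory is allowed in program-memory mode (this is what the boot load uses) but not when the program memory is configured as cache.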


The DMA controller has the following features:

- Two independent channels
- Source and destination addresses may be within the same or different
  modules. These addresses are programmable independently, and can
  remain constant, increment, or decrement on each transfer.
- The transfer count is programmable. Once the transfer count has
  completed, the DMA can send an interrupt to the CPU.
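
The channel features listed above can be modeled in software to show how the address modes and the transfer count interact. This is a sketch, not the register-level interface: the class, the mode names, the dictionary used as memory, and the 4-byte address step (one 32-bit word per transfer) are all assumptions made for illustration.

```python
class DmaChannelModel:
    """Toy model of one DMA channel moving 32-bit words."""

    STEP = {"constant": 0, "increment": 4, "decrement": -4}  # bytes per transfer

    def __init__(self, src: int, dst: int, count: int,
                 src_mode: str = "increment", dst_mode: str = "increment"):
        self.src, self.dst, self.count = src, dst, count
        self.src_step = self.STEP[src_mode]
        self.dst_step = self.STEP[dst_mode]
        self.interrupt_pending = False

    def run(self, memory: dict) -> None:
        """Perform all transfers; memory maps byte address -> 32-bit word."""
        while self.count > 0:
            memory[self.dst] = memory.get(self.src, 0)
            self.src += self.src_step
            self.dst += self.dst_step
            self.count -= 1
        self.interrupt_pending = True  # signal completion to the CPU
```

A constant destination address models writing a block of words to a single location, such as a port, while incrementing both addresses models a block copy.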
Figure 4-2. DMA Controller Interconnect to ’C62xx Memory-Mapped Modules


The DMA has the lowest priority at every module it accesses, and it must wait
until no transfers are being initiated to the internal data and program memory
it intends to access. DMA accesses to internal memory perform cycle stealing;
therefore, no subsequent CPU accesses of internal memory are hampered by
a DMA access. However, if the CPU accesses the EMIF while a multi-cycle
DMA access is in progress, it must wait until that access completes.


Each DMA channel has an independent set of registers that must be
programmed to control the operation of the DMA.


See the data sheet for the specific device to find the memory mapping of DMA
control registers. These registers are 32 bits wide and must be accessed
through 32-bit accesses from the CPU.



4.3 Host-Port Interface (HPI) 


The host-port interface (HPI) operates as a straightforward asynchronous
interface. The HPI is a 16-bit wide access port through which a host (external)
processor can read from, and write to, the ’C62xx’s internal data memory.


A host processor access to the ’C62xx internal data memory through the HPI
consists of two operations, which follow:


1) The host must gain control over the HPI by performing the
   request/acknowledge handshake through the HREQ/HACK signals.


2) Once access has been granted, the host may perform read and write
   operations to the ’C62xx internal data memory.


The mapping of host-port address to the ’C62xx internal memory address is
described in the data sheet for your ’C62xx device.


The HPI on the production release of the TMS320C6201 will have the ability 
to boot load the CPU as well as access the full range of the ’C6201’s memory. 
Also, the HPI will offer improved performance and will be capable of operating 
without impacting CPU performance. 


For more information on the host port, see the TMS320C6xx Peripheral
Reference Guide (literature number SPRU190).
4.4 Power-Down Logic


The ’C62xx supports three power-down modes that can reduce system power
requirements significantly. The modes are as follows:

- Idle1
- Idle2
- Idle3

The three lower bits of the power-down field in the control status register (CSR)
are used to initiate the three power-down modes. If more than one of these
power-down bits is set, the power-down mode is selected by the most
significant bit that is set.


When in a power-down mode, the ’C62xx can be reactivated by a reset, an
enabled interrupt, or any interrupt. Bits three and four of the power-down field in
the control status register set the wake-up condition.


The power-down mode bit and the wake-up bit must be set by the same
instruction to ensure proper power-down operation. See the TMS320C6xx
Peripheral Reference Guide (literature number SPRU190) for more details on
the power-down logic.
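
The selection rule for the power-down field can be sketched as a decoder. Two points are assumptions rather than statements from this manual: that bits 0, 1, and 2 correspond to Idle1, Idle2, and Idle3 in that order, and that the field is five bits wide with the wake-up bits directly above the mode bits. The meaning of each wake-up bit is not given here, so the sketch returns those bits undecoded.

```python
# Assumed bit-to-mode assignment (not stated in the text).
MODE_BY_BIT = {0: "Idle1", 1: "Idle2", 2: "Idle3"}


def decode_power_down(field: int) -> tuple:
    """Return (selected mode or None, raw wake-up bits) for a 5-bit field."""
    mode_bits = field & 0b00111
    wake_bits = (field >> 3) & 0b11
    mode = None
    for bit in (2, 1, 0):             # most significant set bit wins
        if mode_bits & (1 << bit):
            mode = MODE_BY_BIT[bit]
            break
    return mode, wake_bits
```

Because the mode and wake-up bits must be written by the same instruction, software would build the complete field value first and write it once.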


Chapter 5 


Development Support


The TMS320C6x design environment reflects the unique nature of the
advanced VLIW architecture. The environment includes code-generation tools,
evaluation tools, documentation, on-line help with various tools, and a Web
site on the Internet (www.ti.com/sc/C6x) with complete technical
documentation.

5.1 Code-Generation Tools 
A complete development tool set for both the PC and Sun workstations
includes the following:

- C compiler
- Assembly optimizer
- Linker

The environment is founded on the generation’s highly advanced C compiler
and TI’s revolutionary assembly optimizer. Figure 5-1 shows a flow of the
process to develop code.


The ’C6x generation’s C compiler eliminates the need for extensive knowledge 
of DSP architecture, allowing you to take full advantage of the world’s most 
powerful DSP. This highly-structured, architecture-independent C code 
development environment dramatically reduces development time for new 
products. At the same time, it maintains the inherent performance benefits of 
the ’C6x generation’s advanced VLIW architecture. The ’C6x compiler offers 
a 3X improvement in efficiency over existing fixed-point C compilers for DSP. 


For application code sections that require the fine tuning of assembly code, the
’C6x generation’s unique assembly optimizer provides the same transparent
programming capability as the C compiler. The tool supports automatic
scheduling, optimizing, and separation of fine-grained parallel tasks from serial,
in-line assembly code, delivering a level of simplicity and power that is
unprecedented in assembly-level tools.


The tools take C or assembly source code and implement many different
optimizations, including software pipelining, to intelligently find and exploit the
unique instruction-level parallelism of the ’C6x. After each step in the process,
the ’C6x tools allow you to evaluate their results and take appropriate steps
to achieve the most parallel code.


Initially, all C code (new or reused from other applications) is run through
the C compiler for the ’C6x. Using the evaluation tools described in the
following section, you can evaluate the code for efficiency. If the performance
is sufficient for the particular application, then the application has been
completed, achieving the fastest possible time-to-market and incurring
minimal engineering cost.


Figure 5-1. Code-Development Flow Chart


A designer who needs to improve code efficiency can use intrinsics,
command-line options, and source-code enhancements as a first step:

- The ’C6x design tools feature two sets of intrinsics. The first set includes
  intrinsics that perform DSP-specific operations on instructions that are not
  supported directly in C. The second set is designed to facilitate 16-bit
  operation on a 32-bit machine. These intrinsic functions can be invoked to
  tune the performance of the C code.
- The designer can experiment with several command-line options that
  cause the compiler to perform more aggressive optimization. One
  particularly useful option instructs the compiler to compile an entire
  application at once, giving the compiler visibility across program sections
  and more knowledge of the way in which variables and functions are used.
  Another option causes the compiler to perform global optimizations across
  an entire application.
- Source-code enhancements can be made to exploit specific features of
  the ’C6x architecture. For example, the ’C6x has support for operating on
  words containing two 16-bit quantities; therefore, you can utilize 32-bit
  loads and stores when operating on arrays containing 16-bit data and
  easily achieve a 2X performance improvement.


Taken together, these actions extract a large amount of parallelism from C 
code. 
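
The packed 16-bit technique mentioned in the list above can be illustrated in a language-neutral way. The sketch below models, in Python, one 32-bit word holding two 16-bit values and an addition that treats the halves independently. The function names are hypothetical, and wraparound on overflow is an assumption; this is not the compiler's intrinsic interface.

```python
MASK16 = 0xFFFF


def pack16(lo: int, hi: int) -> int:
    """Pack two 16-bit values into one 32-bit word."""
    return ((hi & MASK16) << 16) | (lo & MASK16)


def unpack16(word: int) -> tuple:
    """Split a 32-bit word into its (lo, hi) 16-bit halves."""
    return word & MASK16, (word >> 16) & MASK16


def add_packed(a: int, b: int) -> int:
    """Add both 16-bit halves independently; no carry crosses the halves."""
    lo = ((a & MASK16) + (b & MASK16)) & MASK16
    hi = (((a >> 16) & MASK16) + ((b >> 16) & MASK16)) & MASK16
    return (hi << 16) | lo


def add_arrays_packed(x: list, y: list) -> list:
    """Add two even-length arrays of 16-bit values, two elements per step."""
    out = []
    for i in range(0, len(x), 2):
        word = add_packed(pack16(x[i], x[i + 1]), pack16(y[i], y[i + 1]))
        out.extend(unpack16(word))
    return out
```

Each loop iteration handles two array elements with one word-sized load per input and one word-sized store, which is the source of the 2X improvement quoted in the text.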


For ultra-high performance applications, extracting every last bit of throughput
from the application code may be necessary. The profiler can identify critical
code segments that might benefit most from being generated in assembly
language.


For these program sections, the designer writes simple, linear ’C6x assembly
code that is input to the TI assembly optimizer. This assembly code consists of
’C6x instructions written without concern for parallel instructions, instruction
latencies, or register usage.


The assembly optimizer tool schedules the instructions, taking into account
the architectural parallelism. The tool honors ’C6x latency requirements,
maximizes parallel code, and performs register allocation.



5.2 Evaluation Tools


The evaluation tools include the following:

- Windows-based debugger interface
- Simulator
- Hardware emulation board

The ’C6x development environment provides a new, intuitive Windows™-based
graphical user interface (GUI) for debugging. The debugger interface
features windows for source, assembly, call stack, memory, registers, and
watch expressions as well as menu and tool bars. The debugger offers
one-click breakpoint setting and dialogs for editing breakpoints. The debugger
also incorporates the dynamic profiler to help users find bottlenecks and
improve code efficiency.


TI will provide C6201 scan-based emulation systems that support hardware
and software debugging of target systems via a JTAG-emulation cable.
Scan-based emulation is a unique, non-intrusive approach to system emulation,
integration, and debugging.


Initially, TI is offering a stand-alone C6201 test and emulation board (TEB) that
interfaces with the host platform through the XDS510™ and XDS510WS
emulators by way of the IEEE Standard 1149.1 (JTAG)-compliant port. The board
features a prototyping area for adding user-defined peripherals. With the
addition of other ’C6x generation members, TI will continually add functionality
to the common development environment as well. Capabilities will ultimately
include a PC plug-in evaluation module (EVM) board, a low-cost PC-based
board that is well-suited for software algorithm development.


The dynamic profiler integrated into the ’C6x debugger creates cycle
histograms that are continuously updated as the code runs. It can show
graphically which functions, ranges, and lines in an application are performance
bottlenecks. The profiler can show:

- The percentage of total execution time spent in any function
- The number of times a function is called
- Total cycles in the application, a function, or a line

A timing display can be built into the application by inserting a few function calls
in the code. The resulting simple cycle counts, obtained without using the
profiler or the debugger, can be printed automatically to allow you to track the
changes in execution speed of an algorithm over time. This output, while less
sophisticated, is continuously available with no further action.
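
The idea of a built-in timing display can be sketched as a wrapper that reads a counter before and after a call and prints the difference. Everything here is illustrative: the function name and parameters are hypothetical, and the host's `time.perf_counter_ns` stands in for whatever cycle counter the target system provides.

```python
import time


def timed(label, func, *args, counter=time.perf_counter_ns, report=print):
    """Run func(*args), report the elapsed counter ticks, return the result."""
    start = counter()
    result = func(*args)
    elapsed = counter() - start
    report(f"{label}: {elapsed} ticks")
    return result
```

A call such as `timed("sum", sum, [1, 2, 3])` prints one line per invocation, so the trend in execution time is visible on every run without starting the profiler or the debugger.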
5.3 Third-Party Support


TI has a long history of strong third-party support, and this continues with the
’C62xx devices. Table 5-1 lists some of the companies supporting the ’C62xx
devices and the product areas. Table 5-2 lists the third-party support contacts
with telephone numbers and electronic mail addresses.


Table 5-1. Third-Party Support Companies and Product Area Supported

Company                           Product Area
Ariel Corporation                 High-performance VME64 platform and computer telephony products
Cheops GmbH & Co KG               Industrial and medical imaging and high speed/high resolution videoconferencing
D2 Technologies, Inc.             Embedded Voice Processing (EVP™) computer telephony software
DSP Research, Inc.                TIGER development boards and OEM systems
DSP Software Engineering, Inc.    Multi-channel V.34bis soft-modem and telecom software
Eonic Systems, Inc.               Real-time operating systems: Virtuoso Nano™, Classico™, and MicroLite™
Go DSP Corporation                Code Composer™ support and next generation development tool, Code Maestro™
HotHaus Technologies, Inc.        HausWare: DSP software architecture for embedded telecommunications applications
Innovative Integration            PCI6201 DSP coprocessor for telecom, communications and data acquisition applications
Loughborough Sound Images         PCI/C6200 signal processing platform and PCI/C6220 telecommunications/high density DSP telephony platform
Pentek, Inc.                      Scaleable multi-processor board for the VMEbus, model 9134
Signals & Software Ltd. (SASL)    Very high density ISP modem solution
ViaDSP, Inc.                      InvisiLink™: line of software and firmware for high density computer telephony boards
White Mountain DSP, Inc.          Emulation and multiplatform debug support: Mountain-510, Mountain-510/WS and Mountain-510/LT PCMCIA Card


Table 5-2. Contacts for Third-Party Support

Third-Party Contact               Phone Number         E-mail Address
Ariel Corporation                 609 860-2900         ariel@ariel.com
Cheops GmbH & Co KG               +49 8861 2369 0      100541.3370@compuserve.com
D2 Technologies, Inc.             805 564-3424         sales@d2tech.com
DSP Research, Inc.                408 773-1042         info@dspr.com
DSP Software Engineering, Inc.    617 275-3733         info@dspse.com
Eonic Systems, Inc.               301-572-5000         info@eonic.com
GO DSP Corporation                416 599-6868         gdasilva@go-dsp.com
HotHaus Technologies, Inc.        604-278-4300         info@hothaus.com
Innovative Integration, Inc.      818-865-6150         techsprt@innovative-dsp.com
Loughborough Sound Images         +44 0 1509 634444
Pentek, Inc.                      201-818-5900         info@pentek.com
Signals & Software Ltd. (SASL)    +44 181 426 9533     davem@sasl.demon.co.uk
ViaDSP, Inc.                      508-369-0048         dpenny@viadsp.com
White Mountain DSP, Inc.          603 883-2430         info@wmdsp.com

5.4 Web Site and Documentation 
Visit the Web site at www.ti.com/sc/C6x for information, an interactive
multimedia technical overview (MeTO), documentation, and a schedule of ’C6x
design workshops. The MeTO describes features of the devices in a visual
way with graphics in a point-and-click display for ease of navigation. The Web
site offers a complete training schedule of design workshops and seminars.
Applications assistance and frequently asked questions (FAQ) are also on the
Web site.


Documentation is available directly from the Web site in down-loadable files 
for printing. There is a complete list of documentation available in the Preface 
under Related Documentation from Texas Instruments. 


Appendix A 


Glossary 


address: The number of a particular memory or peripheral storage location.


ALU: Arithmetic logic unit. The high-speed CPU circuit that does calculating
and comparing. Numbers are transferred from registers into the ALU for
calculation, and the results are sent back to a register.


ASIC: Application-specific integrated circuit. A custom chip designed for a
specific application. It is designed by integrating standard cells from a
library.


Assembler: A software program that creates a machine-language program
from a source file that contains assembly language instructions,
directives, and macro definitions. The assembler substitutes absolute
operation codes for symbolic operation codes.


Assembly Optimizer: A software program that optimizes linear assembly
code, which is assembly code that has not been register-allocated or
scheduled. The assembly optimizer is automatically invoked with the
shell program, cl6x, when one of the input files has a .sa extension.


bootloader: A built-in segment of code that transfers code from an external
source to program memory at power up.


clock cycles: Cycles based on the input from the external clock.


code: A set of instructions written to perform a task; a computer program or
part of a program.


CPU: Central processing unit. The unit that coordinates the functions of a 
processor. 
Data Memory: Memory accessed through the ’C6x’s RAM interface.


DMA: Direct memory access. Specialized circuitry that transfers data from
memory to memory without using the CPU.


DRAM: Dynamic random access memory. The most common type of
computer memory.


EBSP: Enhanced buffered serial ports. 
EMIF: External memory interface. 
execute packet: A packet of instructions that execute in parallel. 


external interrupt: A hardware interrupt triggered by a pin. 


fetch packet: A packet containing up to eight instructions held in memory 
for execution by the CPU. 


global interrupt enable (GIE): A bit in the control status register (CSR). 
Used to enable or disable maskable interrupts. 


hardware interrupt: An interrupt triggered through physical connections 
with on-chip peripherals or external devices. 


HPI: Host port interface.



interrupt: A condition caused either by an event external to the CPU or by 
a previously executed instruction that forces the current program to be 
suspended and causes the processor to execute an interrupt service 
routine corresponding to the interrupt. 


interrupt service fetch packet (ISFP): A fetch packet used to service
interrupts. If 8 instructions are insufficient, the user must branch out of this
block for additional interrupt service. If the delay slots of the branch do
not reside within the ISFP, execution continues from execute packets in
the next fetch packet (the next ISFP).


ISFP: Interrupt service fetch packet.


IFP: Instruction fetch packet. 


latency: The delay between when a condition occurs and when the device 
reacts to the condition. Also, in a pipeline, the necessary delay between 
the execution of two instructions to ensure that the values used by the 
second instruction are correct. 


LSB: Least significant bit. The lowest order bit in a word.


maskable interrupt: A hardware interrupt that can be enabled or disabled 
through software. 


memory interleaving: A category of techniques for increasing memory 
speed. 


MIPS: Million instructions per second. The execution speed of a computer. 


MSB: Most significant bit. The highest order bit in a word.


NMI: Nonmaskable interrupt.


nonmaskable interrupt: An interrupt that can be neither masked nor
disabled.


overflow: A condition in which the result of an arithmetic operation exceeds
the capacity of the register used to hold that result.


pipeline: A method of executing instructions in an assembly-line fashion. 


pipeline processing: A category of techniques that provide simultaneous, 
or parallel, processing within the computer. It refers to overlapping 
operations by moving data or instructions into a conceptual pipe with all 
stages of the pipe processing simultaneously. 


PLL: Phase-locked loop. 


program memory: Memory accessed through the ’C6x’s program fetch
interface.


RAM: Random-access memory. 


register: A group of bits used for temporarily holding data or for controlling 
or specifying the status of a device. 


reset: A means of bringing the CPU to a known state by setting the registers
and control bits to predetermined values and signaling execution to start
at a specified address.


RISC: Reduced instruction set computing. A computer architecture that
reduces chip complexity by using simpler instructions.


SBSRAM: Synchronous burst static random-access memory. 
SDRAM: Synchronous dynamic random-access memory. 


shifter: A hardware unit that shifts bits in a word to the left or to the right. 



VelociTI: Architecture developed by TI that features very long instruction
words.


VLIW: Very long instruction word. 
Addressing modes, 2-10
Addressing-mode register, 2-10 
ALU. See arithmetic logic unit 
AMR. See addressing-mode register 
Applications 

’C6x, 1-2

TMS320 family, 1-4 
Applications of DSPs, 1-4 
Architecture 


CPU, 2-1, 2-3 
VelociTI, 1-2, 2-1
VLIW, 2-1 


Arithmetic logic unit, 2-3 


Block Diagram, 2-2 


Central processing unit, 1-3 
Architecture, 2-3—-2-6 
Core with peripherals, 2-2 
Data path figure, 2-6 
Data-address paths, 2-5 
Functional units, 2-3, 2-4 
Load and store paths, 2-5 
Memory paths, 2-5 
Register, 2-4 
Register files, 2-4 

Code Development Flow Chart, 5-3 
Code-generation tools, 5-2

CPU. See central processing unit 
Cross paths, CPU, 2-5 

CSR. See control status register 


Data path figure, 2-6 
Data paths, CPU, 2-4 
Development support, 5-2 
Development Tools 

C Compiler, 5-3 

Code Development Flow, 5-3 
Direct Memory Access, 4-5 
Direct memory access, 4-2 


DMA. See direct memory access

DMA controller, 4-5 

DSPs 
Applications, 1-4 
History, 1-3 


EBSP. See enhanced buffered serial ports 
EMIF, 3-1 
See also external memory interface 
Block diagram, 3-7 
Enhanced buffered serial ports, 1-6, 1-7, 4-2 
Evaluation tools, 5-5 
EVM. See evaluation module 
External memory, 3-6, 4-4 
External memory interface, 4-2 
Features of the ’C62xx, 1-6
Features of the TMS320C6201, 1-7 


Functional units, 2-3 
.D, 2-5 
.L, 2-5 
.M, 2-5 
.S, 2-5
Cross paths, 2-5 
Descriptions, Table, 2-5 
Mapping, instructions, 2-7 


Graphical user interface, 5-5 
GUI. See graphical user interface 


Host-port interface, 2-2, 4-2, 4-7 
HPI. See host-port interface 


Idle modes, 4-8 

IFP. See instruction fetch packet 

Instruction fetch packet, 2-3 

Instruction set, 2-7 

Instruction to functional unit mapping, 2-8, 2-9 
Instructional to functional unit mapping, 2-7 
Internal memory, 3-4 

Interrupts, 2-10 

Introduction, TMS320 family overview, 1-3 


Least significant bit, 2-3 
Load and store paths, CPU, 2-5 
LSB. See least significant bit 
Mapping, Instruction to functional unit, 2-7, 2-8, 2-9


Memory, 3-1 

EMIF, 3-6, 4-4 

External, 3-6 

Internal, 3-4 

Program, 3-4 
Memory map, 3-2, 3-3 
Million instructions per second, 1-2 
MIPS. See Million instructions per second 
Multivendor interface protocol (MVIP), 4-2 
MVIP. See Multivendor interface protocol 


NMI. See Non-maskable interrupt 
Non-maskable interrupt, 2-10 


Overview, TMS320 family, 1-3 


Peripherals overview, 4-3 
Power-down logic, 4-8 
Program memory, 3-4 


Reduced instruction set computer, 1-6, 1-7 
Register files, CPU, 2-4 

Register paths, Cross paths, 2-5 

RISC. See Reduced instruction set computer 


SBSRAM. See synchronous burst static RAM 
SDRAM. See synchronous dynamic RAM 
Synchronous burst static RAM, 1-7, 3-6 
Synchronous dynamic RAM, 1-7 


Third-party support, 5-6 
Contacts, 5-7 
Product area, 5-6 
TMS320, Introduction to the ’C6x, 1-2 
TMS320 DSPs, Applications table, 1-5 
TMS320 family 
Advantages, 1-3 
Applications, 1-4—1-5 
Characteristics, 1-3 
Development, 1-3 
History, 1-3 
Overview, 1-3 
TMS320C6201, 1-2 
TMS320C6201 CPU Core With Peripherals, 2-2 
TMS320C62xx, Peripherals, 4-2 
Tool set, 5-2 



Tools 
Assembly optimizer, 5-2 
C Compiler, 5-2 
Debugger, 5-5 
Evaluation, 5-5 
Hardware emulation board, 5-5 
Linker, 5-2 
Simulator, 5-5 


Using the C Compiler, 5-3 


VelociTI, 1-2, 2-1
Very-long instruction word, 1-2 
VLIW. See Very-long instruction word 